Sparse high-dimensional linear regression. Estimating squared error and a phase transition
Authors
Abstract
We consider a sparse high-dimensional regression model where the goal is to recover a k-sparse unknown binary vector β∗ from n noisy linear observations of the form Y = Xβ∗ + W ∈ R^n, where X ∈ R^{n×p} has i.i.d. N(0,1) entries and W ∈ R^n has i.i.d. N(0,σ²) entries. In the high signal-to-noise ratio and sublinear sparsity regime, while the sample size needed to recover β∗ information-theoretically is known to be of order n∗ := 2k log p / log(k/σ² + 1), no polynomial-time algorithm is known to succeed unless n > nalg := (2k + σ²) log p. In this work, we offer a series of results investigating multiple computational and statistical aspects of the recovery task in the regime n ∈ [n∗, nalg]. First, we establish a novel information-theoretic property of the MLE of the problem taking place around n = n∗ samples, which we coin an "all-or-nothing behavior": when n > n∗ the MLE recovers almost perfectly the support of β∗, while if n < n∗ it fails to do so. Second, we establish that the problem exhibits the Overlap Gap Property (OGP) when n is below a constant multiple of nalg, and that above a constant multiple of nalg the OGP disappears. The OGP is a geometric "disconnectivity" property which initially appeared in the theory of spin glasses and is known to suggest algorithmic hardness when it occurs. Finally, using certain technical results obtained in the study of this transition, we additionally establish various positive and negative algorithmic results of interest, including the failure of LASSO with access to n ∈ [n∗, nalg] samples and the success of a simple local search method with the same number of samples.
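The sampling model and the two sample-size thresholds above can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not code from the paper; the parameter values k, p, σ² below are arbitrary choices for demonstration.

```python
import numpy as np

def thresholds(k, p, sigma2):
    """Information-theoretic and algorithmic sample-size thresholds:
    n* = 2k log p / log(k/sigma^2 + 1) and n_alg = (2k + sigma^2) log p."""
    n_star = 2 * k * np.log(p) / np.log(k / sigma2 + 1.0)
    n_alg = (2 * k + sigma2) * np.log(p)
    return n_star, n_alg

def sample_model(n, p, k, sigma2, rng):
    """Draw one instance of Y = X beta* + W with a k-sparse binary beta*."""
    beta = np.zeros(p)
    beta[rng.choice(p, size=k, replace=False)] = 1.0  # k-sparse binary signal
    X = rng.standard_normal((n, p))                   # i.i.d. N(0,1) design
    W = np.sqrt(sigma2) * rng.standard_normal(n)      # i.i.d. N(0, sigma^2) noise
    return X, X @ beta + W, beta

rng = np.random.default_rng(0)
k, p, sigma2 = 10, 1000, 1.0  # illustrative values only
n_star, n_alg = thresholds(k, p, sigma2)
# The conjectured computationally hard regime is sample sizes n in (n*, n_alg).
print(n_star < n_alg)  # prints True for these parameters
```

The gap between n∗ and nalg is where the abstract places the statistical-computational trade-off: instances are drawn exactly as in `sample_model`, but with n in that interval no polynomial-time recovery algorithm is known.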
Similar resources
High Dimensional Regression with Binary Coefficients. Estimating Squared Error and a Phase Transition
We consider a sparse linear regression model Y = Xβ∗ + W where X is an n × p matrix with i.i.d. Gaussian entries, W is an n × 1 noise vector with i.i.d. mean-zero Gaussian entries of standard deviation σ, and β∗ is a p × 1 binary vector with support size (sparsity) k. Using a novel conditional second moment method, we obtain an approximation, tight up to a multiplicative constant, of the optimal squared error m...
Nearly Optimal Minimax Estimator for High Dimensional Sparse Linear Regression
We present estimators for a well-studied statistical estimation problem: estimation in the linear regression model with soft sparsity constraints (ℓq constraint with 0 < q ≤ 1) in the high-dimensional setting. We first present a family of estimators, called the projected nearest neighbor estimator, and show, by using results from convex geometry, that such an estimator is within a logarithmic ...
Robust Estimation in Linear Regression with Multicollinearity and Sparse Models
One of the factors affecting the statistical analysis of data is the presence of outliers. Methods that are not affected by outliers are called robust methods. Robust regression methods are robust estimation methods for regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...
Robust High-Dimensional Linear Regression
The effectiveness of supervised learning techniques has made them ubiquitous in research and practice. In high-dimensional settings, supervised learning commonly relies on dimensionality reduction to improve performance and identify the most important factors in predicting outcomes. However, the economic importance of learning has made it a natural target for adversarial manipulation of trainin...
Estimating a Bounded Normal Mean Relative to Squared Error Loss Function
Let be a random sample from a normal distribution with unknown mean and known variance. The usual estimator of the mean, i.e., the sample mean, is the maximum likelihood estimator, which under the squared error loss function is a minimax and admissible estimator. In many practical situations, the mean is known in advance to lie in an interval, say for some . In this case, the maximum likelihood estimator...
Journal
Journal title: Annals of Statistics
Year: 2022
ISSN: 0090-5364, 2168-8966
DOI: https://doi.org/10.1214/21-aos2130